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Abstract 

The physalis method is suitable for the simulation of flows with suspended spherical 
particles. It differs from standard immersed boundary methods due to the use of a local 
spectral representation of the solution in the neighborhood of each particle, which is used to 
bridge the gap between the particle surface and the underlying fixed Cartesian grid. This 
analytic solution involves coefficients which are determined by matching with the finite- 
difference solution farther away from the particle. In the original implementation of the 
method this step was executed by solving an over-determined linear system via the singular- 
value decomposition. Here a more efficient method to achieve the same end is described. The 
basic idea is to use scalar products of the finite-difference solutions with spherical harmonic 
functions taken over a spherical surface concentric with the particle. The new approach is 
tested on a number of examples and is found to posses a comparable accuracy to the original 
one, but to be significantly faster and to require less memory. An unusual test case that 
we describe demonstrates the accuracy with which the method conserves the fluid angular 
momentum in the case of a rotating particle. 
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1. Introduction 

Sediment transport, settling suspensions, fluidized beds and porous media are but a 
few examples of fluid flows with fixed or moving particles. As these examples show, such 
flows are frequently encountered and of great importance in many problems of scientific and 
technological relevance. From a computational perspective, they pose a major challenge as 
the particles constitute a very complex and frequently time dependent internal boundary 
which needs to be adequately resolved for a bona fide simulation. 

Until very recently the only practical approach was the Eulerian-Lagrangian point-particle 
model, in which the particles are assimilated to mathematical points contributing extra forces 
to the fluid, while their motion is found by integrating in time a dynamic equation in which 
the fluid forces are modelled rather than deduced from first principles. The last decade has 
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seen the development of methods capable of dealing with the finite extent of the particles. 
The most successful such method so far is the one developed by Uhlmann [l|, 0], which has 
been also used by Lucci et al. [3]. 

An alternative approach, dubbed physalis, first described in 0, 1^ and, more fully, in jfj|, 
differs from standard immersed boundary methods in its reliance on a local spectral repre- 
sentation of the solution in the neighborhood of each particle; a brief description is provided 
in section [2j Advantages of this method are the exact satisfaction of the no-slip condition 
at the particle surface, the great simplification of the calculation of the hydrodynamic forces 
and couples, and the avoidance of the complex issues arising from the lack of geometrical 
conformity between the curved particle boundary and the underlying fixed Cartesian grid. 
Furthermore, the use of a local spectral representation of the solution permits one to describe 
the effect of each particle with fewer degrees of freedom than conventional finite-difference- 
based methods. As a consequence, the grid resolution can be kept relatively low without 
compromising the accuracy of the solution. 

physalis has been thoroughly validated in the original papers and it has proven its 
value in a number of applications: the sedimentation of 1,024 particles the turbulent 
flow around a fixed particle [8], the flow induced by a rotating particle [i| and over a porous 
medium 111] . An interesting variant which uses artificial compressibility to generate a 



time-stepping pressure equation has been described and used by Perrin and Hu [12j, Il3 |. 

The basic idea of physalis is to use an analytic local solution of the modified (Navier-) 
Stokes equation as a bridge between the particle boundary and the finite-difference solution. 
The local analytical solution contains undetermined coefficients which are found iteratively 
by matching the finite-difference solution with the local one. 

The significant improvement described in this paper concerns the calculation of the ex- 
pansion coefficients, which was carried out using the singular-value decomposition in the 
original implementation. We show how this step can be executed by taking suitable scalar 
products with a significant saving in execution time. The purpose of this paper is to describe 
the implementation of this new method and its verification. 

2. General framework 

We briefly review here the PHYSALIS method for particle simulation, as originally intro- 
duced for cylindrical particles 0, \^ , and later extended to spherical particles p . 

The objective is to simulate, on a fixed Cartesian grid, incompressible fluid flow around 
spherical particles, while obeying the no-slip and no-penetration velocity boundary-conditions 
on the particle surface. To this end it is necessary to solve the incompressible Navier-Stokes 
equations, 
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where U, p, p, and g respectively denote the velocity, density, pressure, and gravity fields. 

As with other immersed-boundary methods, since the particles do not conform to the 
computational grid, the boundary conditions imposed by the particles must be transferred to 
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the adjacent nodes. The physalis method enables this transferral to be done in a spectrally 
accurate manner. 

To proceed, we transform the above equations to a frame in which the particle is at rest. 
(In the case of more than one particle this step is carried for each particle.) The fluid velocity 
u in the particle rest-frame is related to U, the velocity in the laboratory frame, by 



U = u + w + f2xx, 



(3) 



where w and O respectively represent the linear and angular velocities of the particle, and x 
is the position with respect to the particle center. The momentum equation in the particle 
rest frame is given by 
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and that of continuity is given by 

Now let 



Vp + pV 2 u + pg - p 



V-u = 0. 



w + f2xx + fix (f2 x x) 



u 



where r 



(4) 
(5) 

m •; • r •• r •• - (6) 

and a represents the particle radius. Then, if we introduce the modified 
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velocity and pressure fields 

u = u — u n and p 
we can write the rest-frame momentum and continuity equations respectively as 



p-p n , 
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Since the term within the square brackets vanishes at the particle surface, it will be small 
in its immediate vicinity. Therefore, in this region, p and u will approximately satisfy the 
Stokes equations, given by 



Vp + pV 2 u = and V 
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for which Lamb [14j| developed an exact general solution for flow past a sphere. It should be 
stressed that the method is intended for application to finite-Reynolds-number flows. The 
procedure may be seen as a linearization of the Navier-Stokes equations about the solid-body 
motion which the fluid in contact with the particle surface executes at any Reynolds number 
by virtue of the no-slip condition. 

The flow field is expanded in terms of vector spherical harmonics (allied to the stan- 
dard scalar spherical harmonics), which form a complete basis for square- integrable vector 
functions on the sphere [l5j|. The velocity u is given by 14. 16 
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and the pressure p by 

oo 

p=7Eft. (12) 

n=— oo 

where the p„, n , and \ n are solid harmonics of order n: 

Pn 
4>n 
Xn 

Here p nm , (p nm , Xnm are undetermined dimensionless coefficients and the functions Y™ are 
normalized spherical harmonics which are shown in detail in the Appendix. The regular 
harmonics, corresponding to non-negative n, represent a general incident flow, while the 
singular harmonics (negative n) represent the disturbance field due to the particle. Although 
the coefficients {p nm , (p nm , x nm } are in general complex, the relations {p nm , nm , x nm \ = 
( — l) m {p n -m,4 ) n-m,Xn-rn} ( m which the overlme denotes the complex conjugate) obviate 
the need for dealing with complex numbers and the entire calculation can be carried out in 
the real domain. 

The central observation at the root of the physalis method lies in the recognition that, 
in the immediate neighborhood of the particle, the velocity and pressure fields must be 
describable as in ( TTTT) and ( 1T2|) . Thus, provided the coefficients are known, ( [111 and (112!) can 
be used to "transfer" the boundary conditions from the particle surface to a cage of adjacent 
grid nodes enclosing the particle. After this step, the computation is carried out only on the 
nodes of the fixed Cartesian grid and the complexity deriving from the geometric mis-match 
between the particle boundary and the grid is eliminated. 

With this background, we can now describe the algorithm: 

1. By matching a provisional finite-difference solution (e.g., that at the previous time 
step) to the explicit representations ffTTT) . ffT2l and related ones (see below) at the cage 
nodes, find provisional values of the expansion coefficients; 

2. Use these provisional values to generate new velocity boundary conditions at the cage 
nodes; 

3. Solve again the Navier-Stokes equations on the finite-difference grid using these bound- 
ary conditions; 

4. Repeat to convergence. 

It is important to note that the coefficients appearing in (fT3|) embody the physics of the 
interaction of the particle with the fluid and contain therefore important information. For 
example, the hydrodynamic force, F, and couple, L, acting on the particle can be expressed 
in terms of the n = 1 coefficients as 

F = pv(w - g) + 6np,a& + 7r/ia 3 P (14) 
L = SvpX. + pa 2 Vn, (15) 

where v = |vra 3 is the volume of the particle, and we have defined <E» = {4> r n , <p l n , <fii ), 
P = (Pn>Pii>Pio)> and X = (xn; Xn, Xio)> with superscripts r and i denoting the real and 
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imaginary parts. Thus, unlike other immersed boundary methods, once the coefficients are 
known the force and couple are known as well with no need for additional calculations. 

As discussed in the following section, the specific contribution of the present paper con- 
cerns the first step, that of deducing from a given velocity and pressure field the values of 
the coefficients. 



3. Computing the expansion coefficients 

Away from the particle(s), the Navier-Stokes equations (JTJ) are solved on a staggered grid 
by a standard finite-difference method as described in jfjj. This reference also explains in 
detail the manner in which the cage of Cartesian nodes enclosing the particle is constructed. 
Let with K — 1, 2, . . . , N be the cage nodes. In the original algorithm, the coefficients 
are calculated by requiring that the pressure field given by f lT2|) and the vorticity field a> 
obtained from ( ITT]) reproduce the finite-difference results. In other words, one forms a linear 
system involving the coefficients as unknowns, e.g. 

oo n 

P=^H (v) n £ PnmY™(e K ,tp K ) = p(x K )-p n (x K ) , (16) 

n=— oo m=—n 

in which x# = (tr, 8k, 4>k), the left-hand side is as given in (TT21 and contains the unknown 
coefficients, while the right-hand side is the finite-difference result at the cage nodes. The 
vorticity field is treated similarly. As explained before, the coefficients thus obtained are 
then substituted into (fTTj) to generate velocity boundary conditions at the cage nodes. 

In this way, four scalar equations are obtained for each cell comprising the cage. Taken 
together over the cage, this set of equations forms an overdetermined linear system for the 
unknown coefficients which is solved by means of the singular value decomposition (SVD) 
algorithm, i.e., in essence, in a least-squares manner. Although the method is stable, the 
typically large systems that result (e.g., a matrix of 2400 x 46 for a grid-resolution of 8 cells 
per radius, and with the analytical expressions truncated at N c = 3) prompted us to search 
for a more efficient procedure. 

There is, however, an alternate and perhaps more natural way to obtain the coefficients 
which relies on the orthogonality property of the spherical harmonics, for example 

(Y t k , Y?) = r sin Odd Y^Y^dtp = 6 ln 6 km . (17) 
Jo Jo 

In the spirit of this approach, (1TB]) is replaced by 

0T>P) = {Y™,p-P n ) • (18) 
or, upon substituting (fT2"j) for p and using ffTT]) and recalling that Y™ n _ x = Y T 
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0C,p) = ^(p nm + p_ n _^ m ). (19) 
cr 

It is shown in the Appendix that the coefficients of negative order can be expressed in terms 
of those of positive order to find 
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This property of the spherical harmonics to single out individual coefficients is analogous to 
the Fourier transform and, as discussed in section HJ there exists a "Fast Spherical Transform" 
(FST) 0, incorporating and scaling as the Fast Fourier Transform. As will be shown below, 
this procedure proves considerably faster than the SVD approach of [6], and comparably 
accurate. 

The other coefficients can be determined by a similar procedure. However there are 
different possible choices of field variables to use for this purpose, which we now discuss. 

In the derivations we will use two orthogonality properties of the vector spherical har- 
monics similar to ( fT71) . namely (see e.g. Ref. 15]) 

(rVY l k ,rVY™) = r 2 f sin 9d9 [ VF* • W n m ^ = n(n + l)8 ln 8 km , (21) 
Jo Jo 
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It is found that, whatever field one may consider 
Xnm appears separately. 



; x x vr;)^ = n (n + i)<ww (22) 

(p nm and p nm appear coupled, while 



3.1. Computing <j) nm and p nm 

In general it is desirable to choose pressure as one of the field variables used to calculate 
the coefficients p nm and 4> nm , since this field is smoother than the velocity field and, addi- 
tionally, does not vanish near the particle. Thus, ( 1201) is one of the equations that we use to 
determine the coefficients. 

A combination of any two out of three other scalar products, involving the radial com- 
ponent of velocity u r = e r ■ u; the transversal components of velocity iii = u — u r e r ; or the 
transversal components of vorticity uj± = uj — u r e r may be used to solve for (p nm and p nm . 
We will present each of these options below, and, in section EJ demonstrate their respective 
performance. 

1. Radial component of velocity: we take the scalar product of Y™ and u r as found from 
( ITT1) . obtaining 
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where we have used f|T7|) and the relations 
with negative index. 

Transversal components of velocity: By using (1211) in fTTTl) we find 
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where again we have used the relations (jUJ) and ( 149|) to eliminate the coefficients with 
negative index. 
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3. Transversal components of vorticity: Upon taking the curl of the velocity given by (TTTj) 
and projecting on the transversal direction we find 
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where for brevity we have defined the operator V' as, e.g., 
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(26) 



Upon taking the scalar product of this equation with x x VY^ fe and using (|22p we obtain 



(xxvr;^i) = n 



i + 



2n- 1 /a\2n+i 
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(27) 

where again (HHj) and (H9l) have been used to eliminate the coefficients with negative 
index. 

Having in hand equation (|20|) , along with either ( 123]) , ( )24|) , or (|27|) , we can now solve for the 
two unknowns, nm and p nm . 

3.2. Computing Xnm 

Three scalar products can be formed that involve Xnm- 

1. Transversal components of velocity: projecting upon x x VF™, rather than r W, 
as done in equation (12^1) . we obtain 
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where we have used equation (|50|) . 
2. Radial component of vorticity: u r is given by 
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(30) 



3. Transversal components of vorticity: projecting uj± upon rVF™, rather than xx VFJ" 
as was the case in equation fT27]) . we obtain 

,2 r / /~i \ 2n+ll / rr> \ n-l 
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Any one of relations (1281 . (|30|) . and (I3T1) . may be used to obtain Xnm- 
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4. Implementation 



The second-order accurate discretization of the Navier-Stokes equations, along with the 
construction of the cage and the manner in which the particle boundary conditions are 
applied at the cage surface, are described in detail in [6|. We will therefore focus our 
discussion here on those aspects of the present methodology that diverge from the original 
version, namely, the calculation of the expansion coefficients. 

The scalar products appearing in f fl8|) and the following equations amount to integrations 
carried out over a spherical surface concentric with and enclosing the particle. In general, 
therefore, the quadrature nodes do not coincide with the finite-difference grid points. We 
thus require a procedure for approximating the values of uj, u, and p at the quadrature 
nodes on the basis of those on the finite-difference grid. To this end we adopt two separate 
approaches: interpolation, and local linearization. For the reasons explained below, we limit 
the latter to those scalar products involving p and in relations ( )20l) . ( l24l) . and (|28l) . 

4-1. Interpolation 

Since the spatial and temporal discretization is second-order accurate, an interpola- 
tion of the same order or higher is appropriate. We experimented with both cubic splines 
and quadratic Lagrange-polynomials, both implemented in the GNU Scientific Library [18| . 
These proved comparably accurate, but the Lagrange interpolation is considerably more 
expedient and was hence adopted. 

Given the point x = (x,y, z) at which the value of e.g. the pressure p is required, the 
interpolation proceeds as follows: 

1. Let Xjj^ be the cell containing x; 

2. Centered on, and including this cell, construct a cube of 27 cells having coordinates 
Xj +aj+/3jfc+7 , with a, ft, 7 = 0, ±1, at which p is known; 

3. Arbitrarily choosing to interpolate first on the y-z planes of the cube, we interpolate 
p(xi, yj, Zk) in z for each yj and Xi, and evaluate the results at z — z, yielding p(xi, yj, z); 

4. Interpolate p(x iy yj, z) in y for each Xj, and evaluate at y — y, yielding p(xi, y, z); 

5. A final interpolation in x is performed and evaluated at x = x, resulting in the desired 
approximation of p(x). 

This three-dimensional, second-order accurate interpolation therefore requires 13 one- dimensional 
interpolations in total. 

4-2. Local linearization 
For pressure, we write 



where x is defined as above, and x c = x ijjfc is the cell-center which, given the staggered-grid 
formulation, is where p resides. The gradient Vp is approximated with second-order accurate 
(at x c ) central differences, using adjoining cells. 
For velocity, the corresponding expression is 




(32) 




where 




dui 



(33) 
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Since the velocity is denned at face centers, we approximate u(x c ), to second-order accuracy, 
with the average across the cell. The gradients are calculated with central differences. The 
diagonal components Ju(x c ) reside at the cell-center. As the remaining components reside 
at edge-centers, we approximate them to second-order accuracy with the average across the 
cell. 

While the quadratic interpolation has higher accuracy than the linearization, the latter 
has some advantages that make it worth exploring. Firstly, note that the estimate u(x) is 
divergence-free to the same accuracy as the velocity field at the nodes, since 

V • u(x) = Jii(x c " 



dxi 



(34) 



while the same is not necessarily true of the interpolation. Secondly, the operation count 
involved with the linearization is lower than for the interpolation. Lastly, the spatial extent 
of the region involved with the computation is reduced, as only the six adjacent, face-sharing 
cells need be considered. 

The same approach applied to the vorticity would only give a zero-order accuracy. An 
effort to improve this low accuracy, on the other hand, would require a comparatively much 
larger set of cells. This is the reason why we did not pursue this method for the vorticity 
options. 

With these procedures for sampling the flow field at the quadrature points on the sphere, 
the next task is to evaluate the scalar products. The present integrands (e.g., in equa- 
tion (j!7p ) are oscillatory and increasingly so with higher n. Particular care must therefore 
be exercised in distributing the quadrature points in order to maintain stability. Given the 
homogeneity of the azimuthal (</>) direction, a uniform grid (f>k — 2nk/2N, is appropriate, 
where k = 0, . . . , 2N — 1 and 2N is the number of quadrature points in <p. The natu- 
ral distribution in the polar (6) direction, spanned by the associated Legendre polynomials 
(cf. equations (H6|) and fll?]) ), is 8j = n(2j + 1)/4N, where j has the same range as k. 

The scalar product on the sphere can essentially be considered as the product of discrete 
Fourier and Legendre transforms in the azimuthal and polar directions, respectively. Using 
the quadrature points described above, the cost of a direct numerical evaluation of these 
transforms scales as 0(N 4 ). Faster alternatives exist, however. The Fast Fourier Transform 
can naturally be used in the azimuthal direction, and there additionally exists a Fast Legen- 
dre Transform, due to [13] ■ The combination of these fast algorithms reduces the cost scaling 



to 0(N \og(N)). The authors of [17J have released a C-implementation of their method, 



the SpharmonicKi^, which we embed in our code 

Lastly, we note that the Fast Spherical Transform of 17| includes only the spherical 
harmonics Y™, and not gradients thereof. To compute the scalar products involving Vl^ 71 
(e.g., equation (1281) ). we express these in terms of the Y™\ 

dY m 

~W = zmrr ' (35) 

U1 n " L Y m _|_ n ' m r -i<t>Y rn + l (W\ 

89 tan(9) n + N n , m+1 e * n ' W 



1 See http://www.cs.dartmouth.edu/~geelong/sphere/ 
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It should be noted that, in spite of the explicit appearance of the imaginary unit, the actual 
computations are carried out in the real domain only. 



5. Results 

In this section we contrast the various aspects of the singular-value decomposition (SVD) 
and the scalar product (SP) version of physalis. In particular we examine relative speed 
and accuracy, and address the question of which SP is best suited. 

5.1. Comparing the speed of the SVD and SP 

As mentioned in sectional [§[ compute the expansion coefficients by matching the vortic- 
ity and pressure, based respectively on the analytical expansions in equations (TTTj) and (TT2l . 
to those from the finite-difference field, resulting in a linear system, Ac = b. A is a rectan- 
gular matrix of size L x M, where L = + N p and M = 3N C (N C + 2) + 1, where n = N c 
is the degree at which the summations in equations ( TTTj) and ( fl2l) are truncated; c is the 
vector containing the unknown expansion coefficients; and b contains the values from the 
finite-difference field. N u and N p respectively represent the number of vorticity and pressure 
nodes comprising the cage. 

The commonly used SVD algorithm can be shown [19] to scale asymptotically as 0(LM 2 + 
M 3 ). In contrast, the cost of the FST scales as 0(N 2 log(iV)), where 2N x 2N is the number 
of samples. If the flow variables u>, u, and p are band-limited so that n < B, then N = B will 
suffice to calculate the coefficients without error (this is analogous to the Nyquist- Shannon 



sampling theorem [2CJ), assuming that the samples are error- free [17]]. In the general case 
we do not know B and, therefore, to prevent aliasing, take N = 3N C . In terms of N c , the 
cost of the FST therefore scales similarly. 

As shown in N c = 4 suffices to adequately describe flows with particle Reynolds 
numbers up to about 100, with a grid resolution of a/h = 8, where h is the grid spacing. 
As such, the asymptotic scalings above are not as relevant as empirical measurements of the 
cost associated with a single calculation of the coefficients. Such measurements are shown in 
figure [Lj both as a function of N c (with fixed a/h) and of the grid resolution a/h (with fixed 
N c ). The SP approach proves over two orders of magnitude faster than the SVD at all N c 
shown. We also note that the cost of both the interpolation and the linearization, described 
in the previous section, is independent of the grid resolution as the number of quadrature 
points is only determined by iV c . This is apparent from figure [IJ right, while the cost of the 
SVD grows strongly with a/h. 

While the SP has been shown to be significantly faster than the SVD, there are some 
caveats of note. Firstly, the most expensive portion of the overall calculation is the solution 
of the Poisson equation for pressure. The speed-up in the calculation of the coefficients, while 
impressive, will therefore only affect the overall calculation fractionally, and not proportion- 
ally. Notwithstanding, the overall gains can be significant, particularly in the presence of 
many particles. Secondly, we note that the SP is only faster for moving particles. Since 
the cage of stationary particles is fixed, the SVD needs only to be computed once, and the 
resulting decomposition can be re-used, resulting in comparable speed with the SP. If the 
particle is moving, on the other hand, the cage moves with it (once the particle moves a 
distance of h/2 from the center of the cage, a new cage is constructed 0]), requiring a call 
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to the SVD. Although this does not happen at every time step, the effect can be nonetheless 
significant as shown in section 15.51 

5.2. Recovery of assigned coefficients in Stokes flow 

The simplest test of the accuracy with which the coefficients are calculated can be carried 
out on a flow field generated analytically by using the expressions f ill I) and (|12p . We use 
these relations with arbitrarily chosen coefficients to assign the fields at the grid points. We 
choose N c = 4 (for a total of 3N C (N C + 2) + 1 = 73 coefficients assigned), and a spatial 
resolution of a/h = 10. The SPs are computed at r/a = 1.5. Given that the entire field 
in this case is Stokes, the solution is, in theory, insensitive to the choice of r/a. However, 
the further out one is from the surface of the particle, the finer the Cartesian grid becomes 
relative to the scale (r / a) on which the Y™ vary. Later we will demonstrate how the accuracy 
varies with r/a for flows in which the Stokes region has a finite extent. 

To quantify the accuracy with which the assigned coefficients are recovered from the 
assigned fields, we define an error metric as 



/ , \Pnm Pnm\ i (37) 
m=—n 

where p^ m and p nm respectively denote the assigned and recovered values; and E x are 
defined in the same way. 

Figure [2] shows the results so obtained. Generally, the recovery by all methods is reason- 
able, with a maximum error level of E p = 6%, achieved by the SVD. In fact, all of the SPs 
prove more accurate than the SVD under the E p error metric. In contrast the SVD incurs a 
larger error at higher values of n. This instability, presumably associated with the numerical 
implementation of the SVD, is particularly apparent in E p , but can also be noted in and 




11 



E x . In terms of E p and E^, the SP involving the interpolated uj_ proves most accurate, 
followed by those involving u r and and the linearization estimate of (recall, from 
section HI that linearization is only used with and p). The interpolated uj_ also performs 
best in terms of E x , but now followed by u)±, u r , and the linearized u^. 

Although the reasons for the differing performance, in terms E p , E^, and E x , are some- 
what unclear, we can make some observations. Firstly, we note that the interpolated Uj_ 
is likely more accurate than u r due to its vector nature: the relevant SPs (equations ff24|) 
and (128]) ) involve two velocity components, presumably resulting in some cancellation of 
error. While the same is true of CJ±, it should be remembered that this is a derived quantity 
and some error will be associated with the differentiation of the velocity field. The results 
involving the linearized approximations in equations ( 132|) and ( 133]) are likely inferior to the 
interpolated ones due to first-order accuracy of the former. Stokes flow with N c = 4 contains 
relatively rapid variations, which are better captured by the second-order method. This is 
evidenced by setting N c = 2, at which degree the two methods produce nearly the same 
results (not shown). A further confirmation is provided by the results of the next sections 
where the two methods show comparable performance in smoother flows in which only the 
lowest order coefficients (n < 2) prove significant. 

5.3. Drag on a sphere in a triply-periodic array 

In this section we examine the accuracy of the SVD and SPs in a pressure-driven flow 
through a triply-periodic array of spheres. As in p , we write the pressure gradient as 

Vp = Pe z + Vp, (38) 

in which p is periodic, e z is the unit vector in the flow direction z, and P is a constant. By 
applying a momentum balance in the flow direction at steady state, it can be shown that 
the force F = F z e z on the particle is given by 

F z = (l- /3)L 3 P, (39) 

where (3 = ^na 3 /L 3 . We use a periodic domain of size L = 4a, with the particle fixed at the 
center. The flow is initially stationary, at which time a pressure gradient = o?P/ [iv = 10 
is applied. This accelerates the flow until it reaches a steady state where the drag force on 
the particle matches that applied by the pressure gradient, resulting in a Reynolds number 
of Re = 2aU jv — 22.6, where U is the mean flow rate. The accuracy of the force calculation 
in equation ([141 can then be gauged by comparing with the exact result in equation ( I391) . 

Using a resolution of a/h = 8, we calculate the error |1 — F z /(1 — (3)L 3 P\ as a function 
of N c , with the SPs being computed at r/a = 1.25. Figure |3] show the results so obtained. 
Good results are obtained for all methods, with the error being less than 1% for iV c > 3, as 
was the case in 0]. Notable is the performance of SP involving uj±, suffering the lowest error 
out of all the methods, including the SVD. The two SPs involving now prove comparably 
accurate, with the interpolated one only having a slight advantage. This is likely due to the 
smoothness of the flow field, as the n = 1 coefficients are dominant. Incidentally, these are 
the coefficients involved with the force calculation, as can be seen from equation ([14]) . In 
contrast, the SP involving u r suffers the largest error of all. While all velocities vanish at 
the surface of the particle, the radial component does so more rapidly than the transversal 
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n 



Figure 2: Here we examine the accuracy with which the coefficients p nm , <j) nm and x„ m are recovered from a 
Stokes flow constructed by assigning coefficients in equations (fTTj) and (|12p . All coefficients are set simulta- 
neously and the error measured according to equation (|37|) . Parameters: N c = 4, a/h = 10, and SPs taken 
at r/a = 1.5. 



13 



0.03 



* — SVD 




Figure 3: Here we compute a momentum balance on a single sphere in a triply-periodic array of spheres 
with an imposed pressure gradient, and compare with the analytical result in equation (|39|) . as a function of 
N c . Parameters: L = 4a, a/h = 8, and SPs computed at r/a — 1.25. 



ones. This causes larger errors in the interpolation, as the signal to noise ratio is diminished. 
This explains why u r proves less accurate here than in the previous section, where the SPs 
were computed at r/a = 1.5. 

Since the SPs are computed at a given distance from the particle surface, this test case 
provides an opportunity to explore the extent of the Stokes region, at least for this flow 
configuration. As one moves closer to the surface, SPs involving velocity will at some point 
be increasingly in error as the velocity vanishes. The SPs involving vorticity and pressure 
will also suffer error in this region, as neither is continuous through the surface of the cage 
where the boundary conditions are applied. Thus, upon moving towards the cage, one will 
eventually interpolate through this discontinuity, resulting in error. The error will also grow 
as we move further from the particle and out of the Stokes region. 

Figure H] shows error calculations analogous to those in figure [3, but now as a function of 
r/a, with N c fixed at 3. The behavior of the different SPs with r/a is not uniform, with u:± 
having a minimum in error at r/a = 1.25, while the minimum for those involving velocities 
occurs close to r/a = 1.5. 

Lastly, we note that, as this test case is nearly axisymmetric in the vicinity of the particle, 
the coefficients Xnm are very small and as such we cannot draw conclusions as to which SP 
serves best to compute this. This is addressed in the next subsection. 
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Figure 4: Error computed as in figure El but as a function of the r/a at which the SPs are computed, with 
fixed N c — 3. The figure on the right shows a close-up view of the rectangular region on the left. 



5.4- Couple on a sphere in a triply-periodic array 

In the previous section we presented results demonstrating the accuracy with which the 
coefficient-calculation of the SVD and different SPs conserves the momentum of flow driven 
by a pressure gradient. In this section we perform an analogous set of calculations, but this 
time involving the conservation of angular momentum. This is a stringent test of which few 
examples, if any, can be found in the literature. 

A particle rotated at a constant rate ft will impart angular momentum to the surrounding 
fluid. In turn, in a finite or periodic domain, at steady state the couple at the surface of the 
particle must be balanced by the couple on the fluid on the outer boundary. The calculation 
proceeds as follows. 

We take the cross-product of x, the position vector relative to the particle center, with 
the momentum equation p]) at steady state and in the absence of gravity, and integrate over 
the domain 



P 

where 



/ x x V • (UU) alV = [ x x ( V ■ tr) dV, (40) 
Jv Jv 

fdUi dUA 

°a = -p s « + A* a- + or ■ ( 41 ) 



dxj dxi 



By the symmetry of the stress tensor we have 



xx (V • cr)^ = CijkXj— — = -^—{eijkXjO-ki) - e ijk a kj 



dxi dxi 

V-(xx<T)|i. (42) 



An application of the divergence theorem transforms equation fj40|) to 

p I x x U(U -n)dS = I x x (tr ■ n) dS + I x x (tr ■ n) dS, (43) 

iS */ S J Sr> 
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where S is the surface of the domain and S p that of the particle; n is the normal vector 
directed out of the fluid. We have dropped the integral over the particle surface in the 
left-hand side as the radial component of velocity vanishes there. For periodic boundary 
conditions, which we assume, the remaining integral in the left-hand-side also vanishes. 
Furthermore 



J Sp 

is just the hydrodynamic couple on the particle (with the minus sign due to the direction of 
the normal) and equation (jl3l) therefore becomes 



We take f2 = f2 2 e 2 so that L p = L p e z and Lj, = Lj,e 2 . We use a domain of size 4a, a 
grid resolution of a/h = 8, and a rotation rate of Re = a 2 VL z /v = 20. We compute the error 
|1 — Lb/L p \, as a function of N c , analogously with the previous section. Figure [5] shows the 
results. With the exception of lo±, all SPs prove more accurate than the SVD for N c > 2. 
At first glance, this seems at odds with the results of the previous section, where u)± was the 
most accurate SP. However, with Gj = Q z e z , 6)^ = and, therefore, uj± = uge 9 . Hence, the 
second component of this SP is lost. Further, a significant portion of ug is associated with 
the solid-body rotation of the particle, which gets subtracted out in the frame of the particle, 
decreasing the signal-to- noise ratio in the calculation of ug, which in turn causes higher error 
levels. In contrast, the SP associated with u r now performs best out of all candidates, which 
is also at odds with the results of the previous section. As in the previous section, the 
SPs involving the interpolated, and linearized, estimates of provide practically the same 
results. 

We also compute the above error for fixed N c = 3, but varying the r/a at which the SPs 
are computed. Figure M shows the results so obtained. As was the case in figure HI the error 
for different SPs does not behave uniformly in r/a. Generally, though, it seems that there 
is some leeway in the placement of r/a, with 1.1 < r/a < 1.5 providing reasonable results. 

The difference between the performance of the various SPs in this section and the previous 
section is instructive. Clearly, which SP will perform best depends on the flow conditions. 
If, for example, u r is very small, it will not deliver a reliable estimate of the Xnm (via 
equation ( 130]) ). On the whole, it seems that the combination of p and provides consistently 
good estimates across widely varying flow conditions. One would further suggest using 
the linearization over the interpolation, given the simplicity of the former and its other 
advantages described in section HJ 

5. 5. A falling particle 

The test cases in the previous sections were all for non-translating particles. We now 
describe some tests on a freely falling particle in a parallelepipedal container of length 20a 
and cross section 8a x 8a. The boundary conditions in the lateral and vertical directions 
are periodic and no-slip, respectively. The ratio of the particle density to the fluid density 
is Pp/p = 2, the kinematic viscosity is v = 0.2731 m 2 /s, and the acceleration of gravity has 
the standard value g ■ e 2 =-9.81 m/s 2 . The simulation is carried out with a/h = 6 and 8, 




(44) 




(45) 
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Figure 5: Here we compute the error in the calculation of couple on a particle rotating at constant velocity, 
as a function of N c . The results for SVD at N c = 8 are absent due to convergence failure. Parameters: 
L = 4a, aj h = 8, and SPs computed at r/a = 1.25. 
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Figure 6: Error in the calculation of couple on a particle rotating at constant velocity, as a function of the 
r/a at which the SPs are computed. The results for a>j_ at r/a = 1.8 are absent due to convergence failure. 
Parameters: L = 4a, a/h — 8, and N c = 3. 



respectively using N c = 2 and 3. The time-step in all cases was uAt/a 2 = 0.005, resulting in 
a CFL number of 0.45 at a/h — 8. In light of the previous results, we narrow our scope of 
SPs to those involving the linearization of ui and p, and those involving the quadratically 
interpolated u>± and p. 

The particle is released from rest at t — 0. If no provision is made, at t — the 
weight of the particle would be suddenly imposed on the fluid, thus generating strong force 
oscillations. To avoid this artifact we have ramped up the value of the acceleration of gravity 
in an exponential manner over a time < vt/a 2 < 0.1. 

Figure [7] shows the evolution of the Reynolds number Re = law /v. In an unbounded 
fluid the steady state Reynolds number predicted by the Schiller-Naumann correlation [21[ 
is 25, but, due to the lateral constraint, here it is close to 22.5 in all cases. Essentially, the 
effect of the constraint can be viewed as a blockage effect due to the image particles. While 
the steady-state Reynolds number varies slightly between the different methods and at the 
two different resolutions, the largest difference is less than 3%. 

Since the cage changes with time, for the reasons explained in section 15.1} this is also a 
good test case to examine the CPU-time used by particle-related operations. Figure M shows 
the cumulative time in seconds thus expended, as measured using an Intel 2.66GHz CPU. 
As expected, both SPs provide a significant speed-up, particularly for a/h = 8. For a/h = 6, 
the SPs have cumulatively cost roughly 14 s at vt/a 2 = 1, while the SVD costs roughly 
109 s, larger by a factor of 8. The cost associated with the SVD increases by a factor of 6 
upon increasing the resolution to a/h = 8, while that of the SPs only increases by a factor 
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of 2 (recall that N c = 2 and 3 respectively for a/h = 6 and 8). This is consistent with the 
time measurements demonstrated in figure HJ The cumulative savings in the cost of particle- 
related operations would be proportionally increased in the presence of multiple particles. 
Lastly, the cost associated with the SP based on the linearized velocity and pressure is 
roughly 50% less than that based on the interpolated transversal vorticity and pressure. 

It is generally the case with immersed-boundary methods that force oscillations are pro- 
duced as the particle translates with respect to the underlying grid. It was shown in 0] that 
physalis is not immune from this problem, and it is therefore of interest to examine the 
performance of the SPs also in this respect. 

Generally speaking, the origin of the force oscillations must be sought in the minor dis- 
continuities that the calculated force undergoes as the particle moves. The same mechanism 
is responsible for the force oscillations encountered upon a sudden release of the particle as 
mentioned before. A discontinuity in the hydrodynamic force on the particle, by the action- 
reaction principle, causes an impulsive force on the fluid, which responds with a pressure 
impulse. This is the reason why the artificial compressibility approach of 12j, [l3j suffers 
from this difficulty much less than the original physalis. 

In our method, at the end of each time-step, the position of the particle center is compared 
with that of the cage center and, if the difference is greater than h/2, the cage is displaced 
(gf. As a consequence, a different set of nodes is used to compute the coefficients at the 
next time step, and this causes an unavoidable small discontinuity in the force calculation 
even though the displacement of the cage is small enough that both the old and the new 
cages are entirely in the Stokes region. This mechanism leads one to expect that the force 
oscillations decrease with a refinement of the grid, as it was indeed observed in 0] and as 
will be shown presently. Another contributing factor may be the fact that the location at 
which the boundary conditions are applied on the fluid is different when the cage is moved. 
Furthermore, some of the grid points that were previously inside the cage become fluid 
points and need to be re- initialized as such, as described by (if. However, this is likely not 
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Figure 8: Cumulative time spent on particle-related operations. 



a significant source of force oscillations due to the iterations in the inner loop (steps 1-4 in 
section [2]) . 

Figure [9] shows the time evolution of the normalized force on the particle for a/h = 6 and 
8. It should be noted that, for clarity, the curves for the original SVD method and the SP 
with have been translated by ±0.1. The expected decrease of the amplitude of oscillations 
with increasing grid refinement is evident for all three methods, although the SVD curves 
show significant spikes even with the finer grid. The amplitude of the oscillations with both 
SP methods is smaller, does not present any spikes and exhibits a greater reduction on the 
finer grid. The greatest reduction is obtained by the SP with the linearized which thus 
is seen to perform well also from this point of view. 

5. 6. Taking the scalar products closer to the particle- surface 

In the above sections, the SPs were taken over a sphere with a radius r with r/a > l+ch/a 
with 1 < c < 4, where a is the radius of the particle. While this does not arise concerns in 
single-particle simulations, it might under circumstances in which particles are expected to 
be in close proximity (such as for flow through a tightly-packed porous medium) and even 
to collide. In this section we show that c > 1 is not an intrinsic limitation of our method 
and that even r/a = 1 is permissible, allowing inter-particle contact. 

As noted before, vorticity and pressure are not continuous through the cage. This is the 
reason why, for a given cage size R c /a and grid resolution, there exists a lower limit on c 
below which the calculation of vorticity and pressure becomes in error. While this is the case 
for both the interpolation and local linearization, the latter is less affected due the smaller 
number of cells involved, thus allowing a smaller c before the cage is breached. 

The constraint of operating outside the cage can be met either by using r/a appreciably 
greater than 1, as done before, or by decreasing R c /a as we now show. The latter option is 
permissible since the analytic expressions fill I) and (j!2p embody a smooth analytic continu- 
ation of the pressure and velocity fields inside the particle. Thus, the boundary conditions 
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a/h = 6,N c = 2 a/h = 8,N c = 3 




Figure 9: Evolution of normalized hydrodynamic force in the direction of gravity, for a particle falling 
from rest between two plates, with periodic boundary conditions in the directions normal to gravity. For 
readability, the curves representing the SVD and the SP involving lj± are respectively shifted by +0.1 and 
-0.1. 



on the cage may be applied on the inside as well. 

As shown in figure fTUl by taking R c /a = 0.9, the SP involving uj±_ gives excellent results 
even with r/a = 1. As the velocities vanish at r/a — 1, the relevant SPs are increasingly in 
error close to the surface, even with R c /a < 1. In situations where r/a < 1.05 is required, 
one would thus be advised to use u}±. Actually, in all the examples considered, except that 
of section I5.4[ this option gives excellent results and the associated computational overhead 
is only modestly larger than for the linearization method. 

6. Summary and conclusions 

In this work we have presented a new approach to calculating the unknown expansion 
coefficients in Lamb's solution for the near-particle flow field in the physalis method. In the 
previous approach of [6], the coefficients were obtained via the SVD solution of a set of linear 
equations formed by equating the analytical expansion, evaluated at the nodes of the finite- 
difference field around the particle, to the finite-difference field itself (cf. equation ( IT6l) ). In 
the new approach we take advantage of the orthogonality of the set of vector harmonics by 
taking suitable scalar products thereof with values obtained from the finite-difference field. 

We have described how the finite-difference field can be estimated on the spherical surface 
over which the scalar products are taken and further how the relevant integrals are computed 
efficiently via a combination of discrete Fourier and Legendre transforms on the sphere. 

There are several options for which fields should be used to calculate the coefficients. Our 
results suggest that the local linearization method based on the pressure and the transverse 
velocity (section I4.2[) is efficient and accurate provided the scalar product can be computed 
on a surface with a radius exceeding that of the particle by 1-2 mesh lengths. For simulations 
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Figure 10: Error for the same flow-configuration as in section T5.31 as a function of the r/a at which the SPs 
are taken, and the size of the cage, R c /a. Note that the SP involving d>j_ can be taken right at the particle 
surface, granted that the cage is inside the particle. 



where the particles are very close or can collide, a more accurate option would be the use of 
the transverse vorticity in addition to the pressure (sections 13.11 and 13 . 2f) . 

Via various examples involving stationary, rotating, and sedimenting particles, we have 
demonstrated that the new approach is more accurate, more stable in terms of force- 
oscillations, and significantly faster than the SVD approach of [6]. The speed-up is par- 
ticularly significant at higher Reynolds numbers, as h ~ Re~ 3 ^. As the number of points 
comprising the cage scales as Re 3 / 2 , the cost of the SVD scales as Re 3 ^ 2 N^, (cf. section lo"TTj) . 
While the same resolution is of course required when using the scalar product, the associated 
cost scales only as N 2 \ogN c . The speed-up reported in this paper are therefore likely to be 
further enhanced at higher Reynolds numbers. 
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Appendix 

The Y™, in equation (fl~3]) and others, are the orthogonal spherical harmonics, 

W 47T [n + 171)1 
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where 6 and ip are respectively the polar and azimuthal angles, and m — 0, ±1, . . . , ±n. For 
m > 0, the associated Legendre polynomials P™ are given by 

d m 

Pn^osO) = (-ir( S iner d{cose)m P n (cos9). (47) 

The constant N™ is included in order to render the Y™ orthonormal. 

The regular and singular solid harmonics can be related through the no-slip boundary 
condition on the particle surface. The resulting expressions are 

1 2n — 1 /a \ 2n+1 
= — n[p n + 2(2n+l)0 n ] - (48) 

2 n + 1 Vr / 
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MX 2n+l 

X-n-1 = - - Xn- (50) 
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